EOSDIS and CMR


• The Earth Observing System Data and Information System (EOSDIS) manages NASA’s Earth science data

• An ever-growing collection of data is archived and distributed by 12 Distributed Active Archive Centers (DAACs)

• Nearly 7,000 collections and 370 million granules are described by metadata housed in the Common Metadata Repository (CMR)

• Data is described using a number of different metadata standards, and core elements of each standard are mapped to and from a common model, the Unified Metadata Model (UMM)
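The idea of mapping core elements "to and from" a common model can be sketched roughly as below. This is a minimal illustration, not the CMR implementation: the per-standard field names and the helper functions `to_common` and `from_common` are illustrative assumptions, and real metadata standards use nested structures rather than flat keys.

```python
# Illustrative (assumed) field names for two standards, mapped to
# common-model names. Only "core" elements appear in the mapping.
TO_COMMON = {
    "DIF10": {"Entry_Title": "EntryTitle", "Short_Name": "ShortName", "Summary": "Abstract"},
    "ECHO10": {"DataSetId": "EntryTitle", "ShortName": "ShortName", "Description": "Abstract"},
}


def to_common(standard, record):
    """Translate a record in a native standard into common-model names."""
    mapping = TO_COMMON[standard]
    # Fields without a mapping are not core elements and are dropped here.
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def from_common(standard, common):
    """Translate a common-model record back into a native standard."""
    reverse = {common_name: native for native, common_name in TO_COMMON[standard].items()}
    return {reverse[key]: value for key, value in common.items() if key in reverse}


dif_record = {"Entry_Title": "Example Title", "Short_Name": "EX1", "Summary": "An abstract."}
common_record = to_common("DIF10", dif_record)
echo_record = from_common("ECHO10", common_record)
```

Because every standard maps through the same common names, a record written in one standard can be presented in another without a dedicated pairwise converter.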
Earthdata Search


• The Earthdata Search Client uses metadata in the CMR to present users with the information they are looking for and hand users off to more specific applications

o Are users finding the information they are looking for? If not, why?
o Are users being handed off to more specific applications? If not, why?
• Poor quality metadata is often the answer
• The CMR functions best when the metadata it houses is complete, consistent, and accurate

• Let’s examine real examples of “less than ideal” metadata and consider the consequences


Search and Discovery

Collection metadata must accurately describe all, not some, of the child granules.

Wide Field Camera (WFC) -> 165K granules
Imaging Infrared Radiometer (IIR) -> 436K granules

LIDAR
Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) -> [...] granules

More than [...]% of the granules are not described by the parent collection.

Terra/MISR


[Figure: timeline, 1999 to 2017. Collection metadata range: 1999-12-18 to 2014-12-18. Actual granule range: 2007-06-01 to 2017-12-11. Search-hit labels on the timeline: 40,000 hits, 15,000 hits, 0 hits.]
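A mismatch between the collection's stated temporal range and the actual granule range, like the one in this timeline, could be detected with a check along these lines. This is a minimal sketch: the function name `uncovered_spans` is hypothetical, and the dates are the ones shown on the slide.

```python
from datetime import date


def uncovered_spans(outer, inner):
    """Return the parts of the `inner` date range that fall outside `outer`.

    Both arguments are (start, end) tuples of datetime.date.
    """
    o_start, o_end = outer
    i_start, i_end = inner
    gaps = []
    if i_start < o_start:
        # Portion of `inner` that begins before `outer` starts.
        gaps.append((i_start, min(i_end, o_start)))
    if i_end > o_end:
        # Portion of `inner` that continues after `outer` ends.
        gaps.append((max(i_start, o_end), i_end))
    return gaps


collection_range = (date(1999, 12, 18), date(2014, 12, 18))
granule_range = (date(2007, 6, 1), date(2017, 12, 11))

# Granules that exist but are not described by the collection's range.
undescribed = uncovered_spans(collection_range, granule_range)

# Time the collection claims to cover but for which no granules exist.
empty = uncovered_spans(granule_range, collection_range)
```

With the slide's dates, `undescribed` covers the period after the collection's stated end date, and `empty` covers the years before the first granule. A date-filtered search can fail in both places: granules are missed in the first, and the collection matches but returns nothing in the second.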


Accessibility 


• Can I access the data via direct download?
• Served correct data?

• Served all data requested?

Usability

• Does the metadata enable users to be handed off to online documentation?

• User’s guides, README files, ATBDs, FAQ pages, product quality assessments, etc.


What is metadata curation? 


Traditional curation
Information Age web content curation
Digital curation

“Digital curation involves maintaining, preserving and adding value to digital research data throughout its lifecycle.”

“...curation enhances the long-term value of existing data by making it available for further high quality research.”

Digital Curation Center, Edinburgh, Scotland

Analysis and Review of CMR (ARC) Team


• All have been or currently are users of NASA Earth Science data for research applications

• Backgrounds in Earth science, atmospheric science, space science, and remote sensing

• Previous experience from the Climate Data Initiative (CDI)
o Review of 850 metadata records for quality and accessibility


ARC’s approach to digital curation 


Compliance

• Required elements
• Controlled vocabulary
• Broken URLs
• UMM usage
• DOIs

Compliance + Content

• Accuracy
• Consistency across collections
• Addition of new information
• Comprehensibility
• Keyword relevancy
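The compliance checks listed above lend themselves to automation. The sketch below shows roughly what such checks might look like. It is not ARC's actual tooling: the required-element list, the keyword vocabulary, the record layout, and the function name `compliance_findings` are all assumptions made for illustration. The URL check is syntactic only; detecting a genuinely broken link would require a network request.

```python
from urllib.parse import urlparse

# Hypothetical subset of required elements and controlled vocabulary.
REQUIRED_ELEMENTS = ("ShortName", "EntryTitle", "Abstract")
ALLOWED_KEYWORDS = {"ATMOSPHERE", "OCEANS", "LAND SURFACE"}


def compliance_findings(record):
    """Return a list of human-readable compliance findings for one record."""
    findings = []
    for element in REQUIRED_ELEMENTS:
        if not record.get(element):
            findings.append(f"missing required element: {element}")
    for keyword in record.get("Keywords", []):
        if keyword.upper() not in ALLOWED_KEYWORDS:
            findings.append(f"keyword not in controlled vocabulary: {keyword}")
    for url in record.get("RelatedUrls", []):
        parts = urlparse(url)
        # Syntax check only: a well-formed URL can still be a dead link.
        if parts.scheme not in ("http", "https", "ftp") or not parts.netloc:
            findings.append(f"malformed URL: {url}")
    if not record.get("DOI"):
        findings.append("no DOI")
    return findings


example_record = {
    "ShortName": "EXAMPLE_L2",
    "EntryTitle": "",
    "Abstract": "An example abstract.",
    "Keywords": ["Atmosphere", "Clouds?"],
    "RelatedUrls": ["https://example.com/guide", "not a url"],
    "DOI": None,
}
example_findings = compliance_findings(example_record)
```

Checks for accuracy, comprehensibility, and keyword relevancy are harder to automate, which is why the content review is a separate step.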


ARC Curation Process 


• Import collection metadata record from CMR
• Perform automated compliance review
• 2 curators each perform a manual content review
• Process is repeated for 1 randomly selected granule (when granule exists)
• Collection and granule findings are delivered to the data center
o Enables the data center to begin incrementally updating its records
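The granule step of the process, one randomly selected granule per collection when any exist, might be sketched as follows. The function name `pick_granule_for_review` and the optional seeded generator are assumptions added for illustration; the slides do not say how the random selection is made.

```python
import random


def pick_granule_for_review(granule_ids, rng=None):
    """Pick one granule at random, or return None if the collection has none."""
    granule_ids = list(granule_ids)
    if not granule_ids:
        # Collections without granules skip the granule review.
        return None
    rng = rng or random.Random()
    return rng.choice(granule_ids)


# Passing a seeded generator makes the selection repeatable (an assumption;
# an unseeded generator is used when no rng is supplied).
chosen = pick_granule_for_review(["G1", "G2", "G3"], rng=random.Random(2017))
```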


ARC Curation Process 


• Resolve collection and granule metadata content issues
• Discuss UMM evolution and brainstorm new Earthdata Search Client functionalities
• DAAC performs incremental metadata improvements
• DAAC ingests improved metadata into CMR

CMR Team

Stakeholders collaborate to address both DAAC-specific and EOSDIS-wide issues


ARC Curation Process 


• Priority classification scheme
o Assist DAAC in formulating a strategic plan to address findings
o High: Inaccurate, incomplete, or missing content
o Broken URLs and invalid collection-granule relationships
o Revisions of existing content
o Addition of new information


• ARC submits findings to DAACs

o Overview report (identifies DAAC-wide issues)
o Detailed reports (identify record-specific issues)

• DAAC submits a report to ESDIS on a strategy and timeline devised to work off findings
• DAAC works off findings with the ARC and CMR teams available for support

• DAAC alters internal processes as needed to ensure adherence to EOSDIS policies and best practices moving forward
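One way to separate DAAC-wide issues (for the overview report) from record-specific issues (for the detailed reports) is to count how many records share each finding. The sketch below assumes a simple rule: a finding is DAAC-wide when it appears in at least half of the reviewed records. That threshold, the function name `split_findings`, and the data layout are hypothetical and not taken from the slides.

```python
from collections import Counter


def split_findings(findings_by_record, threshold=0.5):
    """Split findings into DAAC-wide issues and per-record issues.

    findings_by_record maps a record identifier to a list of finding strings.
    """
    total = len(findings_by_record)
    # Count each finding at most once per record.
    counts = Counter(
        finding
        for findings in findings_by_record.values()
        for finding in set(findings)
    )
    daac_wide = sorted(
        finding for finding, count in counts.items()
        if total and count / total >= threshold
    )
    detailed = {
        record: sorted(set(findings) - set(daac_wide))
        for record, findings in findings_by_record.items()
    }
    return daac_wide, detailed


overview, per_record = split_findings({
    "collection_A": ["no DOI", "broken URL"],
    "collection_B": ["no DOI"],
    "collection_C": ["no DOI", "missing abstract"],
})
```

In this example, "no DOI" appears in every record and lands in the overview, while the broken URL and missing abstract stay with their individual records.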


Phase I

• Mid 2016 to late 2017

• Records from all 12 DAACs reviewed

• 1,959 collections reviewed

• GHRC, ASF, and CDDIS fully reviewed

• Supported CDDIS and SEDAC in the generation of brand new collection and granule metadata


ARC Collection Reviews Ending December 2017

[Figure: chart of collections reviewed against each DAAC’s total collection holdings. Legible labels: CDDIS 90; 1,141; 1,131; 1,288; 1,281.]


Key Outcomes from Phase I

• Evaluation of updated metadata for ORNL and SEDAC

[Table: ORNL, initial review versus follow-up review]

• Brand new granule metadata achieved a passing rate of 94% (average initial granule passing rate is 65%)


Phase II

• ARC reviews will transition to an online dashboard environment
o Improve ARC/DAAC communication
o Enable automated metric tracking

• Implement a more strategic approach to ARC delivery of findings

• Track DAAC improvements from Phase I

• Improve UMM documentation and provide new reference resources for metadata authors

• Document and disseminate best practices that have emerged from the curation effort

Phase I: ~2,000 collections
Phase II: ~5,000 collections


Questions 


Adam Sisco 


adam.sisco@nsstc.uah.edu 